Large-scale plant protein subcellular location prediction.

Authors

  • Kuo-Chen Chou
  • Hong-Bin Shen
Abstract

Current plant genome sequencing projects call for novel and powerful high-throughput tools for timely annotation of the subcellular location of uncharacterized plant proteins. In view of this, an ensemble classifier, Plant-PLoc, formed by fusing many basic individual classifiers, has been developed for large-scale subcellular location prediction for plant proteins. Each of the basic classifiers was built on the K-Nearest Neighbor (KNN) rule. Plant-PLoc discriminates plant proteins among the following 11 subcellular locations: (1) cell wall, (2) chloroplast, (3) cytoplasm, (4) endoplasmic reticulum, (5) extracell, (6) mitochondrion, (7) nucleus, (8) peroxisome, (9) plasma membrane, (10) plastid, and (11) vacuole. As a demonstration, predictions were performed on a stringent benchmark dataset in which no protein has ≥25% sequence identity to any other protein in the same subcellular location, so as to avoid homology bias. The overall success rate thus obtained was 32-51% higher than the rates obtained by previous methods on the same benchmark dataset. The essence of Plant-PLoc in enhancing the prediction quality and its significance in biological applications are discussed. Plant-PLoc is accessible to the public as a free web server at http://202.120.37.186/bioinf/plant. Furthermore, for public convenience, results predicted by Plant-PLoc are provided in a downloadable file at the same website for all plant protein entries in the Swiss-Prot database that either lack subcellular location annotations or are annotated as uncertain. The large-scale results will be updated twice a year to include new plant protein entries and to reflect the continuous development of Plant-PLoc.
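The idea of fusing several KNN-based classifiers can be illustrated with a minimal sketch. The abstract does not state which sequence features or which fusion rule Plant-PLoc uses, so everything below is an assumption made for illustration: the features are plain amino acid composition, the base classifiers differ only in their value of K, and the fusion is a simple majority vote. The function names and the toy sequences are hypothetical and are not taken from the paper or its web server.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence.

    Assumption: composition is used here only as a simple stand-in feature.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_predict(train, query, k):
    """Classify `query` by majority label among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse several KNN classifiers (one per k) by simple majority vote.

    Assumption: majority voting is an illustrative fusion rule only.
    """
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


# Toy, made-up sequences; NOT real proteins and NOT the benchmark dataset.
TOY_TRAINING = [
    ("KKRKKRRKKR", "nucleus"),
    ("RKRKRKKKRR", "nucleus"),
    ("KRRKKKRKRA", "nucleus"),
    ("LLVVIILLAV", "plasma membrane"),
    ("VVLLIIAVLL", "plasma membrane"),
    ("ILVLAVILLV", "plasma membrane"),
]
TRAIN = [(composition(s), label) for s, label in TOY_TRAINING]


def predict_location(seq):
    """Predict a location label for a raw sequence using the toy ensemble."""
    return ensemble_predict(TRAIN, composition(seq))
```

Calling `predict_location` on a new sequence returns one of the labels present in the toy training set. A realistic predictor would need a far larger training set covering all 11 locations and more informative features than raw composition.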

Related resources

SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis

The subcellular location database for Arabidopsis proteins (SUBA3, http://suba.plantenergy.uwa.edu.au) combines manual literature curation of large-scale subcellular proteomics, fluorescent protein visualization and protein-protein interaction (PPI) datasets with subcellular targeting calls from 22 prediction programs. More than 14 500 new experimental locations have been added since its first ...

Construction of Subcellular Locations prediction tool for Three Kingdoms: Fungi, Animal and Plant

Information about the subcellular location of a protein sequence often offers valuable clues to understanding the function of proteins, because specific functions of proteins are usually related to their appropriate environments in a cell. Recently, predicted subcellular localization of proteins has also been used for protein-protein interaction analysis [1]. We already have construct...

Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis.

Substantial experimental datasets defining the subcellular location of Arabidopsis (Arabidopsis thaliana) proteins have been reported in the literature in the form of organelle proteomes built from mass spectrometry data (approximately 2,500 proteins). Subcellular location for specific proteins has also been published based on imaging of chimeric fluorescent fusion proteins in intact cells (app...

Protein Subcellular Localization Prediction

ORGANIZATION. Section 1. We first provide the motivation for prediction of protein subcellular localization sites, as well as discuss changes being brought about by progress in proteomics. Section 2. After that, we describe the biology of protein subcellular location. In particular, we explain the principle of protein sorting signals. Section 3. Then we present several experimental techniques f...

Protein subcellular location prediction.

The function of a protein is closely correlated with its subcellular location. With the rapid increase in new protein sequences entering into data banks, we are confronted with a challenge: is it possible to utilize a bioinformatic approach to help expedite the determination of protein subcellular locations? To explore this problem, proteins were classified, according to their subcellular locat...

Journal:
  • Journal of cellular biochemistry

Volume 100, Issue 3

Pages: -

Publication date: 2007